PreBIND and Textomy - Mining the Biomedical Literature for Protein-Protein Interactions Using a Support Vector Machine

Abstract

Background: The majority of experimentally verified molecular interaction and biological pathway data are present in the unstructured text of biomedical journal articles, where they are inaccessible to computational methods. The Biomolecular Interaction Network Database (BIND) seeks to capture these data in a machine-readable format. We hypothesized that the formidable task of backfilling the database could be reduced by using Support Vector Machine technology to first locate interaction information in the literature. We present an information extraction system that was designed to locate protein-protein interaction data in the literature and present these data to curators and the public for review and entry into BIND.

Results: Cross-validation estimated the support vector machine's test-set precision, accuracy and recall for classifying abstracts describing interaction information to be 92%, 90% and 92%, respectively. We estimated that the system would be able to recall up to 60% of all non-high-throughput interactions present in another yeast-protein interaction database. Finally, this system was applied to a real-world curation problem, and its use was found to reduce the task duration by 70%, thus saving 176 days.

Conclusions: Machine learning methods are useful as tools to direct interaction and pathway database backfilling; however, this potential can only be realized if these techniques are coupled with human review and entry into a factual database such as BIND. The PreBIND system described here is available to the public at http://bind.ca. Current capabilities allow searching for human, mouse and yeast protein-interaction information.
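The precision, accuracy and recall figures quoted in the Results are standard classification metrics. As a minimal sketch of how such figures are derived from a classifier's predictions, the Python below counts outcomes for a binary "describes an interaction / does not" labelling and computes the three metrics. The names `ConfusionCounts` and `count_outcomes` and the toy labels are hypothetical illustrations, not taken from PreBIND, and the sketch does not reproduce the paper's cross-validation procedure or its data.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts for a binary classifier (hypothetical helper type)."""
    tp: int  # abstracts correctly flagged as describing an interaction
    fp: int  # abstracts wrongly flagged as describing an interaction
    tn: int  # abstracts correctly rejected
    fn: int  # interaction abstracts that were missed


def count_outcomes(true_labels: Sequence[int],
                   predicted_labels: Sequence[int]) -> ConfusionCounts:
    """Tally outcomes; label 1 means 'describes an interaction', 0 otherwise."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    tp = fp = tn = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == 1 and guess == 1:
            tp += 1
        elif truth == 0 and guess == 1:
            fp += 1
        elif truth == 0 and guess == 0:
            tn += 1
        else:
            fn += 1
    return ConfusionCounts(tp, fp, tn, fn)


def precision(c: ConfusionCounts) -> float:
    """Fraction of flagged abstracts that truly describe an interaction."""
    flagged = c.tp + c.fp
    return c.tp / flagged if flagged else 0.0


def recall(c: ConfusionCounts) -> float:
    """Fraction of true interaction abstracts that were flagged."""
    relevant = c.tp + c.fn
    return c.tp / relevant if relevant else 0.0


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all abstracts that were classified correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


# Toy labels, invented purely for illustration (not data from the paper).
toy_truth = [1, 1, 1, 0, 0, 0, 1, 0]
toy_guess = [1, 1, 0, 0, 0, 1, 1, 0]
toy_counts = count_outcomes(toy_truth, toy_guess)
print(precision(toy_counts), recall(toy_counts), accuracy(toy_counts))
```

In a cross-validation setting, these metrics would typically be computed on each held-out fold and then averaged; that step is omitted here for brevity.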